Add minimal harness for experimental model family #317613
Draft
bhavyaus wants to merge 6 commits into
Introduces an opt-in 'minimal harness' for capable agentic models being evaluated against a stripped-back prompt + tool surface. Activated by setting family to 'experimental' in chat.modelCapabilityOverrides.
- Add DefaultMinimalPrompt, DefaultMinimalReminderInstructions, and DefaultMinimalToolReferencesHint as a minimal prompt-tsx stack.
- promptRegistry: when no resolver matches and family is experimental, return the minimal prompt/reminder/toolReferencesHint trio. Identity and safety rules remain on the existing defaults.
- toolsService: restrict getEnabledTools for experimental endpoints to a small allowlist (terminal, read, edit, search).
- Extract isMinimalHarnessFamily into chatModelCapabilities so the prompt registry and tools service share one definition.
Pull request overview
Adds an opt-in “minimal harness” path for endpoints whose family is overridden to experimental, providing a stripped-back agent system prompt and restricting the enabled tool surface to a small allowlist. This is integrated into the Copilot extension’s prompt resolution and tool enabling logic.
Changes:
- Introduces `isMinimalHarnessFamily()` in endpoint capabilities to centralize the “experimental/minimal harness” family check.
- Adds a new minimal agent prompt/reminder/tool-references hint and wires it as the fallback when no model-specific prompt resolver matches.
- Restricts `getEnabledTools()` to a curated allowlist when the endpoint is in the minimal harness family.
- Tightens `SearchSubagentTool` hydration by enforcing read-only file access guards and avoiding disclosure of external paths.
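A minimal sketch of the shared family check described in the first bullet. The `EndpointLike` shape, the `MINIMAL_HARNESS_FAMILY` constant, and the function signature are assumptions made for illustration; the actual definition in `chatModelCapabilities.ts` may take a different argument type.

```typescript
// Hypothetical endpoint shape; the real endpoint type carries more fields.
interface EndpointLike {
	family: string;
}

// Family name taken from the PR description; the constant name is an assumption.
const MINIMAL_HARNESS_FAMILY = 'experimental';

/** Returns true when the endpoint (or bare family string) opts into the minimal harness. */
function isMinimalHarnessFamily(endpoint: EndpointLike | string): boolean {
	const family = typeof endpoint === 'string' ? endpoint : endpoint.family;
	return family === MINIMAL_HARNESS_FAMILY;
}
```

Keeping the check in one function means the prompt registry and the tools service cannot drift apart on what counts as an experimental endpoint.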
| File | Description |
|---|---|
| extensions/copilot/src/platform/endpoint/common/chatModelCapabilities.ts | Adds shared “minimal harness family” detection used by prompts/tools. |
| extensions/copilot/src/extension/tools/vscode-node/toolsService.ts | Filters enabled tools to an allowlist when the endpoint is “experimental”. |
| extensions/copilot/src/extension/tools/node/searchSubagentTool.ts | Adds file-access guards before hydration and drops external-path lines on failure. |
| extensions/copilot/src/extension/tools/node/test/searchSubagentTool.spec.ts | Updates tests for the new parseFinalAnswerAndHydrate signature. |
| extensions/copilot/src/extension/prompts/node/agent/promptRegistry.ts | Selects minimal prompt/reminder/tool-hint for “experimental” family when no resolver matches. |
| extensions/copilot/src/extension/prompts/node/agent/defaultMinimalPrompt.tsx | New minimal prompt + reminder + tool reference hint elements. |
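The fallback behavior attributed to `promptRegistry.ts` can be sketched roughly as follows. This is an illustration only: `PromptTrio`, `ResolverLike`, and `resolvePromptTrio` are hypothetical names, and plain strings stand in for the prompt-tsx element classes that the real registry returns.

```typescript
// Hypothetical stand-in for the prompt/reminder/tool-references-hint trio.
interface PromptTrio {
	prompt: string;
	reminder: string;
	toolReferencesHint: string;
}

// Hypothetical resolver shape: a model-specific resolver claims some families.
interface ResolverLike {
	matches(family: string): boolean;
	resolve(): PromptTrio;
}

// Placeholder names for the existing defaults (assumed, not from the PR).
const DEFAULT_TRIO: PromptTrio = {
	prompt: 'DefaultPrompt',
	reminder: 'DefaultReminderInstructions',
	toolReferencesHint: 'DefaultToolReferencesHint',
};

// Element names taken from the PR description.
const MINIMAL_TRIO: PromptTrio = {
	prompt: 'DefaultMinimalPrompt',
	reminder: 'DefaultMinimalReminderInstructions',
	toolReferencesHint: 'DefaultMinimalToolReferencesHint',
};

function resolvePromptTrio(family: string, resolvers: readonly ResolverLike[]): PromptTrio {
	// A model-specific resolver always wins when one matches.
	const resolver = resolvers.find(r => r.matches(family));
	if (resolver) {
		return resolver.resolve();
	}
	// Otherwise fall back: minimal trio for 'experimental', existing defaults for the rest.
	return family === 'experimental' ? MINIMAL_TRIO : DEFAULT_TRIO;
}
```

Note that the minimal trio is only a fallback in this sketch: an endpoint with a matching model-specific resolver never reaches it.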
Copilot's findings
Comments suppressed due to low confidence (1)
extensions/copilot/src/extension/tools/node/test/searchSubagentTool.spec.ts:198
- These lines are over-indented compared to the surrounding test body, which breaks the file’s formatting conventions (tabs/consistent indent). Please align the `const result = ...` statement with the other statements in the test.
].join('\n');
const result = await tool['parseFinalAnswerAndHydrate'](response, '/workspace', undefined, notCancelled);
expect(result).toBe(response);
- Files reviewed: 6/6 changed files
- Comments generated: 3
 * tool schemas to convey the rest.
 *
 * Plumb this through from a model's `IAgentPrompt` resolver when you want to
 * strip back scaffolding. Not auto-registered.
Comment on lines +216 to +226
// Resolve the candidate URI up front so we can reference it from both the
// try and the catch block (for the external-file check below).
const uri = (!path.isAbsolute(filePath) && cwd)
	? URI.joinPath(URI.file(cwd), filePath)
	: URI.file(filePath);

try {
	// For relative paths, immediately resolve against cwd.
	// For absolute paths, use as-is and let openTextDocument throw if not found.
	const uri = (!path.isAbsolute(filePath) && cwd)
		? URI.joinPath(URI.file(cwd), filePath)
		: URI.file(filePath);
	// Enforce read-only file access via shared toolUtils guards before hydrating.
	await this.instantiationService.invokeFunction(accessor =>
		assertFileOkForTool(accessor, uri, this._inputContext, { readOnly: true, workingDirectory })
	);
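The resolve-then-check pattern in this hunk can be illustrated with a standalone sketch. It uses Node's `path.posix` helpers rather than the extension's `URI` type, and both `resolveCandidatePath` and `isInsideDirectory` are hypothetical helpers, not functions from the PR; the real guard is `assertFileOkForTool`, whose internals are not shown here.

```typescript
import * as path from 'node:path';

/**
 * Resolves a path mentioned in a model response: relative paths are joined
 * onto cwd when one is available, absolute paths are used as-is.
 */
function resolveCandidatePath(filePath: string, cwd?: string): string {
	return (!path.posix.isAbsolute(filePath) && cwd)
		? path.posix.join(cwd, filePath)
		: filePath;
}

/** Returns true when candidate is root itself or lies somewhere beneath it. */
function isInsideDirectory(candidate: string, root: string): boolean {
	const rel = path.posix.relative(root, path.posix.normalize(candidate));
	if (rel === '') {
		return true;
	}
	// A relative path that climbs out of root starts with a '..' segment.
	return rel !== '..' && !rel.startsWith('../') && !path.posix.isAbsolute(rel);
}
```

Resolving the candidate before entering the `try` block, as the hunk does, lets the error path run the same kind of containment check on the already-resolved location, so a failure for an external file does not need to echo that path back.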
Comment on lines 183 to 187
].join('\n');

const result = await tool['parseFinalAnswerAndHydrate'](response, '/workspace', notCancelled);
const result = await tool['parseFinalAnswerAndHydrate'](response, '/workspace', undefined, notCancelled);

expect(result).toBe(response);
Introduces an opt-in minimal harness for capable agentic models being evaluated against a stripped-back prompt + tool surface. Activated by setting `family` to `experimental` in the `chat.modelCapabilityOverrides` setting:
```json
"chat.modelCapabilityOverrides": {
"": { "family": "experimental" }
}
```
What's in the minimal harness
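Prompt (when no model-specific resolver matches and family is `experimental`): `DefaultMinimalPrompt`, `DefaultMinimalReminderInstructions`, and `DefaultMinimalToolReferencesHint`. Identity and safety rules remain on the existing defaults.
Tools (`getEnabledTools` filters to this allowlist for experimental endpoints): terminal, read, edit, search.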
Everything else (subagents, todos, fetch, MCP/contributed tools, browser, notebooks, GitHub, replace_string, apply_patch, etc.) is filtered out regardless of request/model defaults.
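The allowlist filtering can be sketched as below. The tool names in the set are placeholders for the four categories named in the description (terminal, read, edit, search), not the extension's actual tool identifiers, and this free function only approximates the shape of the `getEnabledTools` method on the tools service.

```typescript
// Hypothetical tool shape; the real tool descriptors carry schemas and more.
interface ToolLike {
	name: string;
}

// Placeholder names for the four allowed categories (assumed, not real tool ids).
const MINIMAL_HARNESS_ALLOWLIST: ReadonlySet<string> = new Set([
	'terminal',
	'read',
	'edit',
	'search',
]);

function getEnabledTools(family: string, tools: readonly ToolLike[]): ToolLike[] {
	if (family !== 'experimental') {
		// Non-experimental endpoints keep whatever the request/model defaults enabled.
		return [...tools];
	}
	// Experimental endpoints see only allowlisted tools, regardless of defaults.
	return tools.filter(tool => MINIMAL_HARNESS_ALLOWLIST.has(tool.name));
}
```

Because the filter is an allowlist rather than a denylist, newly contributed tools (for example from MCP servers) stay hidden from experimental endpoints unless they are added to the set.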
Design notes
Files